Dataset statistics
| Number of variables | 27 |
|---|---|
| Number of observations | 6912 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 1.4 MiB |
| Average record size in memory | 216.0 B |
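The memory figures in this table are consistent with one another under a simple assumption. The sketch below assumes every cell occupies 8 bytes (for example an int64, a float64, or an object pointer); that width is an assumption made for illustration, not something the report states.

```python
# Figures reported in the overview table.
n_rows = 6912
n_cols = 27

# Assumption: every cell is stored in 8 bytes (int64, float64, or object pointer).
BYTES_PER_CELL = 8

record_bytes = n_cols * BYTES_PER_CELL         # bytes per row
column_kib = n_rows * BYTES_PER_CELL / 1024    # KiB per column
total_mib = n_rows * record_bytes / 1024 ** 2  # MiB for the whole frame

print(f"record size: {record_bytes} B")
print(f"column size: {column_kib:.1f} KiB")
print(f"total size:  {total_mib:.1f} MiB")
```

Under that assumption the results agree with the reported 216.0 B per record, 54.0 KiB per variable, and 1.4 MiB in total.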
Variable types
| NUM | 21 |
|---|---|
| CAT | 5 |
| BOOL | 1 |
Warnings
| Warning | Type |
|---|---|
| case_id is highly correlated with df_index | High correlation |
| df_index is highly correlated with case_id | High correlation |
| age_2016 is highly correlated with age | High correlation |
| age is highly correlated with age_2016 | High correlation |
| df_index has unique values | Unique |
| case_id has unique values | Unique |
| hgc_mother has 76 (1.1%) zeros | Zeros |
| hgc_father has 120 (1.7%) zeros | Zeros |
| major has 174 (2.5%) zeros | Zeros |
| income_2016 has 1832 (26.5%) zeros | Zeros |
| fam_net_worth has 603 (8.7%) zeros | Zeros |
| ch_health_limit has 1103 (16.0%) zeros | Zeros |
| fam_net_income has 210 (3.0%) zeros | Zeros |
| marital has 1050 (15.2%) zeros | Zeros |
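The percentages in the zeros warnings are the zero counts divided by the 6912 observations. A minimal sketch of that arithmetic, with the counts copied from the warnings; the helper name `zero_share` is hypothetical:

```python
n_rows = 6912  # number of observations reported in the overview

# Zero counts copied from the "Zeros" warnings.
zero_counts = {
    "hgc_mother": 76,
    "hgc_father": 120,
    "major": 174,
    "income_2016": 1832,
    "fam_net_worth": 603,
    "ch_health_limit": 1103,
    "fam_net_income": 210,
    "marital": 1050,
}


def zero_share(count, total):
    """Percentage of rows equal to zero, rounded to one decimal place."""
    return round(100 * count / total, 1)


zero_pct = {name: zero_share(count, n_rows) for name, count in zero_counts.items()}
for name, pct in zero_pct.items():
    print(f"{name}: {pct}%")
```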
Reproduction
| Analysis started | 2021-02-03 08:03:43.728964 |
|---|---|
| Analysis finished | 2021-02-03 08:05:29.254354 |
| Duration | 1 minute and 45.53 seconds |
| Software version | pandas-profiling v2.9.0 |
| Download configuration | config.yaml |
df_index
Real number (ℝ≥0)
| Distinct | 6912 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5344.452546 |
|---|---|
| Minimum | 1 |
| Maximum | 12678 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 54.0 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 504.55 |
| Q1 | 2538.75 |
| median | 5065 |
| Q3 | 7948.25 |
| 95-th percentile | 11666.6 |
| Maximum | 12678 |
| Range | 12677 |
| Interquartile range (IQR) | 5409.5 |
Descriptive statistics
| Standard deviation | 3328.338466 |
|---|---|
| Coefficient of variation (CV) | 0.6227650891 |
| Kurtosis | -0.9109141143 |
| Mean | 5344.452546 |
| Median Absolute Deviation (MAD) | 2661.5 |
| Skewness | 0.2955537234 |
| Sum | 36940856 |
| Variance | 11077836.94 |
| Monotonicity | Strictly increasing |
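Several rows in the quantile and descriptive tables are derived from others. The sketch below recomputes them from the reported figures for the variable summarized above; it only checks the arithmetic and does not touch the underlying data.

```python
# Figures copied from the tables above.
n = 6912
mean = 5344.452546
std = 3328.338466
q1, q3 = 2538.75, 7948.25
minimum, maximum = 1, 12678

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range (IQR)
cv = std / mean                  # Coefficient of variation (CV)
variance = std ** 2              # Variance
total = mean * n                 # Sum

print(value_range, iqr)
print(round(cv, 6), round(variance, 2), round(total))
```

Up to rounding of the inputs, these reproduce the reported range, IQR, CV, variance, and sum.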
| Value | Count | Frequency (%) | |
| 8196 | 1 | < 0.1% | |
| 629 | 1 | < 0.1% | |
| 4751 | 1 | < 0.1% | |
| 6798 | 1 | < 0.1% | |
| 653 | 1 | < 0.1% | |
| 2700 | 1 | < 0.1% | |
| 4743 | 1 | < 0.1% | |
| 645 | 1 | < 0.1% | |
| 2692 | 1 | < 0.1% | |
| 6782 | 1 | < 0.1% | |
| Other values (6902) | 6902 | 99.9% |
| Value | Count | Frequency (%) | |
| 1 | 1 | < 0.1% | |
| 2 | 1 | < 0.1% | |
| 3 | 1 | < 0.1% | |
| 5 | 1 | < 0.1% | |
| 7 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 12678 | 1 | < 0.1% | |
| 12666 | 1 | < 0.1% | |
| 12662 | 1 | < 0.1% | |
| 12658 | 1 | < 0.1% | |
| 12647 | 1 | < 0.1% |
case_id
Real number (ℝ≥0)
| Distinct | 6912 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5345.452546 |
|---|---|
| Minimum | 2 |
| Maximum | 12679 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 54.0 KiB |
Quantile statistics
| Minimum | 2 |
|---|---|
| 5-th percentile | 505.55 |
| Q1 | 2539.75 |
| median | 5066 |
| Q3 | 7949.25 |
| 95-th percentile | 11667.6 |
| Maximum | 12679 |
| Range | 12677 |
| Interquartile range (IQR) | 5409.5 |
Descriptive statistics
| Standard deviation | 3328.338466 |
|---|---|
| Coefficient of variation (CV) | 0.6226485854 |
| Kurtosis | -0.9109141143 |
| Mean | 5345.452546 |
| Median Absolute Deviation (MAD) | 2661.5 |
| Skewness | 0.2955537234 |
| Sum | 36947768 |
| Variance | 11077836.94 |
| Monotonicity | Strictly increasing |
| Value | Count | Frequency (%) | |
| 2049 | 1 | < 0.1% | |
| 6710 | 1 | < 0.1% | |
| 597 | 1 | < 0.1% | |
| 4687 | 1 | < 0.1% | |
| 6734 | 1 | < 0.1% | |
| 589 | 1 | < 0.1% | |
| 4679 | 1 | < 0.1% | |
| 2628 | 1 | < 0.1% | |
| 4671 | 1 | < 0.1% | |
| 6718 | 1 | < 0.1% | |
| Other values (6902) | 6902 | 99.9% |
| Value | Count | Frequency (%) | |
| 2 | 1 | < 0.1% | |
| 3 | 1 | < 0.1% | |
| 4 | 1 | < 0.1% | |
| 6 | 1 | < 0.1% | |
| 8 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 12679 | 1 | < 0.1% | |
| 12667 | 1 | < 0.1% | |
| 12663 | 1 | < 0.1% | |
| 12659 | 1 | < 0.1% | |
| 12648 | 1 | < 0.1% |
age
Real number (ℝ≥0)
| Distinct | 9 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 17.56409144 |
|---|---|
| Minimum | 14 |
| Maximum | 22 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 54.0 KiB |
Quantile statistics
| Minimum | 14 |
|---|---|
| 5-th percentile | 14 |
| Q1 | 16 |
| median | 17 |
| Q3 | 19 |
| 95-th percentile | 21 |
| Maximum | 22 |
| Range | 8 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 2.253491559 |
|---|---|
| Coefficient of variation (CV) | 0.1283010606 |
| Kurtosis | -1.060070832 |
| Mean | 17.56409144 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 0.1381518795 |
| Sum | 121403 |
| Variance | 5.078224207 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 15 | 984 | 14.2% | |
| 18 | 968 | 14.0% | |
| 17 | 953 | 13.8% | |
| 16 | 952 | 13.8% | |
| 19 | 816 | 11.8% | |
| 21 | 744 | 10.8% | |
| 20 | 720 | 10.4% | |
| 14 | 599 | 8.7% | |
| 22 | 176 | 2.5% |
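Because this variable takes only nine distinct values, the frequency table above determines its summary statistics completely. As a rough sketch using only the Python standard library, the mean, sample variance, median, and median absolute deviation can be rebuilt from the counts:

```python
import statistics

# Value -> count, copied from the frequency table above.
freq = {14: 599, 15: 984, 16: 952, 17: 953, 18: 968,
        19: 816, 20: 720, 21: 744, 22: 176}

# Expand the table back into one value per observation.
values = [value for value, count in freq.items() for _ in range(count)]

n = len(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)

print(n, round(mean, 6), round(variance, 6), median, mad)
```

The reported variance matches the n - 1 (sample) form rather than the population form.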
| Value | Count | Frequency (%) | |
| 14 | 599 | 8.7% | |
| 15 | 984 | 14.2% | |
| 16 | 952 | 13.8% | |
| 17 | 953 | 13.8% | |
| 18 | 968 | 14.0% |
| Value | Count | Frequency (%) | |
| 22 | 176 | 2.5% | |
| 21 | 744 | 10.8% | |
| 20 | 720 | 10.4% | |
| 19 | 816 | 11.8% | |
| 18 | 968 | 14.0% |
hgc_mother
Real number (ℝ≥0)
| Distinct | 22 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 10.81486026 |
|---|---|
| Minimum | 0 |
| Maximum | 20 |
| Zeros | 76 |
| Zeros (%) | 1.1% |
| Memory size | 54.0 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 4 |
| Q1 | 10 |
| median | 12 |
| Q3 | 12 |
| 95-th percentile | 16 |
| Maximum | 20 |
| Range | 20 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 3.147313988 |
|---|---|
| Coefficient of variation (CV) | 0.2910175363 |
| Kurtosis | 1.654398428 |
| Mean | 10.81486026 |
| Median Absolute Deviation (MAD) | 1.130914295 |
| Skewness | -0.8700925225 |
| Sum | 74752.31411 |
| Variance | 9.905585339 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 12 | 2536 | 36.7% | |
| 11 | 612 | 8.9% | |
| 10 | 551 | 8.0% | |
| 10.8690857 | 433 | 6.3% | |
| 8 | 414 | 6.0% | |
| 9 | 386 | 5.6% | |
| 16 | 365 | 5.3% | |
| 14 | 325 | 4.7% | |
| 6 | 238 | 3.4% | |
| 13 | 210 | 3.0% | |
| Other values (12) | 842 | 12.2% |
| Value | Count | Frequency (%) | |
| 0 | 76 | 1.1% | |
| 1 | 16 | 0.2% | |
| 2 | 57 | 0.8% | |
| 3 | 109 | 1.6% | |
| 4 | 107 | 1.5% |
| Value | Count | Frequency (%) | |
| 20 | 10 | 0.1% | |
| 19 | 9 | 0.1% | |
| 18 | 48 | 0.7% | |
| 17 | 59 | 0.9% | |
| 16 | 365 | 5.3% |
hgc_father
Real number (ℝ≥0)
| Distinct | 22 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 10.90467495 |
|---|---|
| Minimum | 0 |
| Maximum | 20 |
| Zeros | 120 |
| Zeros (%) | 1.7% |
| Memory size | 54.0 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 4 |
| Q1 | 9 |
| median | 11 |
| Q3 | 12 |
| 95-th percentile | 16 |
| Maximum | 20 |
| Range | 20 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 3.671350307 |
|---|---|
| Coefficient of variation (CV) | 0.3366767304 |
| Kurtosis | 1.066876792 |
| Mean | 10.90467495 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | -0.507521343 |
| Sum | 75373.11324 |
| Variance | 13.47881308 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 12 | 2002 | 29.0% | |
| 10.94742647 | 1044 | 15.1% | |
| 16 | 477 | 6.9% | |
| 8 | 471 | 6.8% | |
| 10 | 400 | 5.8% | |
| 11 | 331 | 4.8% | |
| 9 | 307 | 4.4% | |
| 14 | 306 | 4.4% | |
| 6 | 261 | 3.8% | |
| 7 | 158 | 2.3% | |
| Other values (12) | 1155 | 16.7% |
| Value | Count | Frequency (%) | |
| 0 | 120 | 1.7% | |
| 1 | 27 | 0.4% | |
| 2 | 65 | 0.9% | |
| 3 | 129 | 1.9% | |
| 4 | 121 | 1.8% |
| Value | Count | Frequency (%) | |
| 20 | 95 | 1.4% | |
| 19 | 31 | 0.4% | |
| 18 | 126 | 1.8% | |
| 17 | 84 | 1.2% | |
| 16 | 477 | 6.9% |
sample_id
Real number (ℝ≥0)
| Distinct | 18 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7.006221065 |
|---|---|
| Minimum | 1 |
| Maximum | 20 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 54.0 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 3 |
| median | 5 |
| Q3 | 11 |
| 95-th percentile | 14 |
| Maximum | 20 |
| Range | 19 |
| Interquartile range (IQR) | 8 |
Descriptive statistics
| Standard deviation | 4.643392695 |
|---|---|
| Coefficient of variation (CV) | 0.6627528095 |
| Kurtosis | -1.268738049 |
| Mean | 7.006221065 |
| Median Absolute Deviation (MAD) | 4 |
| Skewness | 0.2133219174 |
| Sum | 48427 |
| Variance | 21.56109572 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 5 | 1611 | 23.3% | |
| 1 | 1455 | 21.1% | |
| 13 | 835 | 12.1% | |
| 10 | 747 | 10.8% | |
| 14 | 523 | 7.6% | |
| 11 | 476 | 6.9% | |
| 7 | 305 | 4.4% | |
| 3 | 248 | 3.6% | |
| 8 | 159 | 2.3% | |
| 6 | 143 | 2.1% | |
| Other values (8) | 410 | 5.9% |
| Value | Count | Frequency (%) | |
| 1 | 1455 | 21.1% | |
| 2 | 137 | 2.0% | |
| 3 | 248 | 3.6% | |
| 4 | 134 | 1.9% | |
| 5 | 1611 | 23.3% |
| Value | Count | Frequency (%) | |
| 20 | 2 | < 0.1% | |
| 19 | 4 | 0.1% | |
| 18 | 4 | 0.1% | |
| 17 | 20 | 0.3% | |
| 16 | 52 | 0.8% |
sample_race
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 54.0 KiB |
| Value | Count | Frequency (%) | |
| 3 | 3407 | 49.3% | |
| 2 | 2191 | 31.7% | |
| 1 | 1314 | 19.0% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
sample_sex
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 54.0 KiB |
| Value | Count | Frequency (%) | |
| 2 | 3586 | 51.9% | |
| 1 | 3326 | 48.1% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
pov_1980
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 54.0 KiB |
| Value | Count | Frequency (%) | |
| 0 | 4254 | 61.5% | |
| 0.2021668029 | 1539 | 22.3% | |
| 1 | 1119 | 16.2% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 19 |
|---|---|
| Median length | 3 |
| Mean length | 6.5625 |
| Min length | 3 |
asvab_math
Real number (ℝ≥0)
| Distinct | 533 |
|---|---|
| Distinct (%) | 7.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 471.6306706 |
|---|---|
| Minimum | 177 |
| Maximum | 797 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 54.0 KiB |
Quantile statistics
| Minimum | 177 |
|---|---|
| 5-th percentile | 313 |
| Q1 | 404 |
| median | 475.0400825 |
| Q3 | 535 |
| 95-th percentile | 639 |
| Maximum | 797 |
| Range | 620 |
| Interquartile range (IQR) | 131 |
Descriptive statistics
| Standard deviation | 98.87238576 |
|---|---|
| Coefficient of variation (CV) | 0.2096394317 |
| Kurtosis | 0.05913121182 |
| Mean | 471.6306706 |
| Median Absolute Deviation (MAD) | 66.0400825 |
| Skewness | 0.1522112201 |
| Sum | 3259911.195 |
| Variance | 9775.748667 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 475.0400825 | 429 | 6.2% | |
| 477 | 34 | 0.5% | |
| 491 | 33 | 0.5% | |
| 420 | 33 | 0.5% | |
| 435 | 32 | 0.5% | |
| 474 | 31 | 0.4% | |
| 486 | 30 | 0.4% | |
| 399 | 30 | 0.4% | |
| 446 | 29 | 0.4% | |
| 437 | 29 | 0.4% | |
| Other values (523) | 6202 | 89.7% |
| Value | Count | Frequency (%) | |
| 177 | 1 | < 0.1% | |
| 180 | 1 | < 0.1% | |
| 181 | 2 | < 0.1% | |
| 183 | 1 | < 0.1% | |
| 185 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 797 | 2 | < 0.1% | |
| 796 | 2 | < 0.1% | |
| 794 | 1 | < 0.1% | |
| 783 | 1 | < 0.1% | |
| 771 | 1 | < 0.1% |
asvab_word
Real number (ℝ≥0)
| Distinct | 496 |
|---|---|
| Distinct (%) | 7.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 469.9116967 |
|---|---|
| Minimum | 176 |
| Maximum | 796 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 54.0 KiB |
Quantile statistics
| Minimum | 176 |
|---|---|
| 5-th percentile | 314 |
| Q1 | 401 |
| median | 473 |
| Q3 | 535 |
| 95-th percentile | 653 |
| Maximum | 796 |
| Range | 620 |
| Interquartile range (IQR) | 134 |
Descriptive statistics
| Standard deviation | 99.52431485 |
|---|---|
| Coefficient of variation (CV) | 0.211793653 |
| Kurtosis | -0.1300461309 |
| Mean | 469.9116967 |
| Median Absolute Deviation (MAD) | 67 |
| Skewness | 0.1810439986 |
| Sum | 3248029.648 |
| Variance | 9905.089246 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 475.0717742 | 427 | 6.2% | |
| 674 | 42 | 0.6% | |
| 405 | 37 | 0.5% | |
| 400 | 35 | 0.5% | |
| 423 | 34 | 0.5% | |
| 456 | 34 | 0.5% | |
| 411 | 33 | 0.5% | |
| 681 | 32 | 0.5% | |
| 488 | 31 | 0.4% | |
| 493 | 30 | 0.4% | |
| Other values (486) | 6177 | 89.4% |
| Value | Count | Frequency (%) | |
| 176 | 1 | < 0.1% | |
| 180 | 1 | < 0.1% | |
| 182 | 2 | < 0.1% | |
| 183 | 2 | < 0.1% | |
| 189 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 796 | 1 | < 0.1% | |
| 780 | 2 | < 0.1% | |
| 748 | 1 | < 0.1% | |
| 735 | 4 | 0.1% | |
| 733 | 1 | < 0.1% |
major
Real number (ℝ≥0)
| Distinct | 211 |
|---|---|
| Distinct (%) | 3.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1220.282827 |
|---|---|
| Minimum | 0 |
| Maximum | 9996 |
| Zeros | 174 |
| Zeros (%) | 2.5% |
| Memory size | 54.0 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 501 |
| Q1 | 901 |
| median | 1233.85069 |
| Q3 | 1233.85069 |
| 95-th percentile | 2105 |
| Maximum | 9996 |
| Range | 9996 |
| Interquartile range (IQR) | 332.8506899 |
Descriptive statistics
| Standard deviation | 1075.312638 |
|---|---|
| Coefficient of variation (CV) | 0.8811995175 |
| Kurtosis | 47.71008019 |
| Mean | 1220.282827 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 6.324402937 |
| Sum | 8434594.899 |
| Variance | 1156297.27 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 1233.85069 | 3912 | 56.6% | |
| 506 | 245 | 3.5% | |
| 501 | 179 | 2.6% | |
| 0 | 174 | 2.5% | |
| 502 | 146 | 2.1% | |
| 1203 | 143 | 2.1% | |
| 701 | 125 | 1.8% | |
| 514 | 101 | 1.5% | |
| 9996 | 76 | 1.1% | |
| 909 | 75 | 1.1% | |
| Other values (201) | 1736 | 25.1% |
| Value | Count | Frequency (%) | |
| 0 | 174 | 2.5% | |
| 101 | 7 | 0.1% | |
| 102 | 1 | < 0.1% | |
| 104 | 7 | 0.1% | |
| 106 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 9996 | 76 | 1.1% | |
| 9994 | 1 | < 0.1% | |
| 4903 | 5 | 0.1% | |
| 4902 | 6 | 0.1% | |
| 4901 | 46 | 0.7% |
max_degree
Real number (ℝ≥0)
| Distinct | 9 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.77981613 |
|---|---|
| Minimum | 1 |
| Maximum | 8 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 54.0 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 1 |
| Q3 | 1.787832788 |
| 95-th percentile | 4 |
| Maximum | 8 |
| Range | 7 |
| Interquartile range (IQR) | 0.7878327884 |
Descriptive statistics
| Standard deviation | 1.32198582 |
|---|---|
| Coefficient of variation (CV) | 0.7427653888 |
| Kurtosis | 7.287731324 |
| Mean | 1.77981613 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 2.497819464 |
| Sum | 12302.08909 |
| Variance | 1.747646509 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 1 | 3918 | 56.7% | |
| 1.787832788 | 1357 | 19.6% | |
| 4 | 571 | 8.3% | |
| 2 | 424 | 6.1% | |
| 3 | 389 | 5.6% | |
| 8 | 112 | 1.6% | |
| 5 | 110 | 1.6% | |
| 7 | 27 | 0.4% | |
| 6 | 4 | 0.1% |
| Value | Count | Frequency (%) | |
| 1 | 3918 | 56.7% | |
| 1.787832788 | 1357 | 19.6% | |
| 2 | 424 | 6.1% | |
| 3 | 389 | 5.6% | |
| 4 | 571 | 8.3% |
| Value | Count | Frequency (%) | |
| 8 | 112 | 1.6% | |
| 7 | 27 | 0.4% | |
| 6 | 4 | 0.1% | |
| 5 | 110 | 1.6% | |
| 4 | 571 | 8.3% |
occup_2016
Real number (ℝ≥0)
| Distinct | 404 |
|---|---|
| Distinct (%) | 5.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4245.218674 |
|---|---|
| Minimum | 5 |
| Maximum | 9990 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 54.0 KiB |
Quantile statistics
| Minimum | 5 |
|---|---|
| 5-th percentile | 220 |
| Q1 | 2850 |
| median | 4245.218674 |
| Q3 | 5330 |
| 95-th percentile | 9130 |
| Maximum | 9990 |
| Range | 9985 |
| Interquartile range (IQR) | 2480 |
Descriptive statistics
| Standard deviation | 2382.358388 |
|---|---|
| Coefficient of variation (CV) | 0.5611862594 |
| Kurtosis | -0.1154171575 |
| Mean | 4245.218674 |
| Median Absolute Deviation (MAD) | 1154.781326 |
| Skewness | 0.225899676 |
| Sum | 29342951.47 |
| Variance | 5675631.488 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 4245.218674 | 1589 | 23.0% | |
| 9130 | 172 | 2.5% | |
| 430 | 144 | 2.1% | |
| 5700 | 127 | 1.8% | |
| 4220 | 122 | 1.8% | |
| 3130 | 91 | 1.3% | |
| 3600 | 89 | 1.3% | |
| 4700 | 89 | 1.3% | |
| 10 | 87 | 1.3% | |
| 4610 | 87 | 1.3% | |
| Other values (394) | 4315 | 62.4% |
| Value | Count | Frequency (%) | |
| 5 | 1 | < 0.1% | |
| 10 | 87 | 1.3% | |
| 20 | 47 | 0.7% | |
| 50 | 36 | 0.5% | |
| 60 | 3 | < 0.1% |
| Value | Count | Frequency (%) | |
| 9990 | 4 | 0.1% | |
| 9840 | 6 | 0.1% | |
| 9750 | 3 | < 0.1% | |
| 9720 | 3 | < 0.1% | |
| 9640 | 11 | 0.2% |
css_worker
Real number (ℝ≥0)
| Distinct | 6 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.178147723 |
|---|---|
| Minimum | 1 |
| Maximum | 5 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 54.0 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 2 |
| median | 2 |
| Q3 | 2.178147723 |
| 95-th percentile | 4 |
| Maximum | 5 |
| Range | 4 |
| Interquartile range (IQR) | 0.1781477229 |
Descriptive statistics
| Standard deviation | 0.8251179498 |
|---|---|
| Coefficient of variation (CV) | 0.3788163406 |
| Kurtosis | 1.248158023 |
| Mean | 2.178147723 |
| Median Absolute Deviation (MAD) | 0.1781477229 |
| Skewness | 1.012791249 |
| Sum | 15055.35706 |
| Variance | 0.680819631 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 2 | 2903 | 42.0% | |
| 2.178147723 | 1686 | 24.4% | |
| 1 | 1102 | 15.9% | |
| 4 | 700 | 10.1% | |
| 3 | 465 | 6.7% | |
| 5 | 56 | 0.8% |
| Value | Count | Frequency (%) | |
| 1 | 1102 | 15.9% | |
| 2 | 2903 | 42.0% | |
| 2.178147723 | 1686 | 24.4% | |
| 3 | 465 | 6.7% | |
| 4 | 700 | 10.1% |
| Value | Count | Frequency (%) | |
| 5 | 56 | 0.8% | |
| 4 | 700 | 10.1% | |
| 3 | 465 | 6.7% | |
| 2.178147723 | 1686 | 24.4% | |
| 2 | 2903 | 42.0% |
firm_size
Real number (ℝ≥0)
| Distinct | 235 |
|---|---|
| Distinct (%) | 3.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1333.7936 |
|---|---|
| Minimum | 1 |
| Maximum | 99995 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 54.0 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 3 |
| Q1 | 35 |
| median | 400 |
| Q3 | 1333.7936 |
| 95-th percentile | 2000 |
| Maximum | 99995 |
| Range | 99994 |
| Interquartile range (IQR) | 1298.7936 |
Descriptive statistics
| Standard deviation | 6719.365873 |
|---|---|
| Coefficient of variation (CV) | 5.037785361 |
| Kurtosis | 180.0775199 |
| Mean | 1333.7936 |
| Median Absolute Deviation (MAD) | 397 |
| Skewness | 13.04553673 |
| Sum | 9219181.363 |
| Variance | 45149877.74 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 1333.7936 | 2537 | 36.7% | |
| 100 | 214 | 3.1% | |
| 50 | 175 | 2.5% | |
| 1 | 174 | 2.5% | |
| 200 | 165 | 2.4% | |
| 30 | 146 | 2.1% | |
| 300 | 146 | 2.1% | |
| 40 | 127 | 1.8% | |
| 10 | 125 | 1.8% | |
| 20 | 121 | 1.8% | |
| Other values (225) | 2982 | 43.1% |
| Value | Count | Frequency (%) | |
| 1 | 174 | 2.5% | |
| 2 | 79 | 1.1% | |
| 3 | 106 | 1.5% | |
| 4 | 104 | 1.5% | |
| 5 | 87 | 1.3% |
| Value | Count | Frequency (%) | |
| 99995 | 25 | 0.4% | |
| 80000 | 1 | < 0.1% | |
| 70000 | 4 | 0.1% | |
| 68000 | 1 | < 0.1% | |
| 65000 | 1 | < 0.1% |
unemp_2016
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 54.0 KiB |
| Value | Count | Frequency (%) | |
| 1 | 4081 | 59.0% | |
| 0 | 2831 | 41.0% |
income_2016
Real number (ℝ≥0)
| Distinct | 656 |
|---|---|
| Distinct (%) | 9.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 43395.67692 |
|---|---|
| Minimum | 0 |
| Maximum | 332954 |
| Zeros | 1832 |
| Zeros (%) | 26.5% |
| Memory size | 54.0 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 31000 |
| Q3 | 59000 |
| 95-th percentile | 133000 |
| Maximum | 332954 |
| Range | 332954 |
| Interquartile range (IQR) | 59000 |
Descriptive statistics
| Standard deviation | 56850.92859 |
|---|---|
| Coefficient of variation (CV) | 1.310059725 |
| Kurtosis | 11.82603787 |
| Mean | 43395.67692 |
| Median Absolute Deviation (MAD) | 30273.5 |
| Skewness | 2.986494725 |
| Sum | 299950918.9 |
| Variance | 3232028082 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 0 | 1832 | 26.5% | |
| 43395.67692 | 282 | 4.1% | |
| 30000 | 157 | 2.3% | |
| 40000 | 150 | 2.2% | |
| 332954 | 144 | 2.1% | |
| 50000 | 126 | 1.8% | |
| 60000 | 114 | 1.6% | |
| 25000 | 109 | 1.6% | |
| 45000 | 105 | 1.5% | |
| 35000 | 100 | 1.4% | |
| Other values (646) | 3793 | 54.9% |
| Value | Count | Frequency (%) | |
| 0 | 1832 | 26.5% | |
| 1 | 3 | < 0.1% | |
| 9 | 1 | < 0.1% | |
| 10 | 1 | < 0.1% | |
| 31 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 332954 | 144 | 2.1% | |
| 199500 | 1 | < 0.1% | |
| 199000 | 1 | < 0.1% | |
| 198000 | 1 | < 0.1% | |
| 194000 | 2 | < 0.1% |
fam_net_worth
Real number (ℝ≥0)
| Distinct | 2980 |
|---|---|
| Distinct (%) | 43.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 423345.8911 |
|---|---|
| Minimum | 0 |
| Maximum | 5526252 |
| Zeros | 603 |
| Zeros (%) | 8.7% |
| Memory size | 54.0 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 33075 |
| median | 235150 |
| Q3 | 423345.8911 |
| 95-th percentile | 1500726.65 |
| Maximum | 5526252 |
| Range | 5526252 |
| Interquartile range (IQR) | 390270.8911 |
Descriptive statistics
| Standard deviation | 802878.7071 |
|---|---|
| Coefficient of variation (CV) | 1.896507617 |
| Kurtosis | 26.13216935 |
| Mean | 423345.8911 |
| Median Absolute Deviation (MAD) | 188195.8911 |
| Skewness | 4.780753517 |
| Sum | 2926166799 |
| Variance | 6.446142183e+11 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 423345.8911 | 1533 | 22.2% | |
| 0 | 603 | 8.7% | |
| 5526252 | 119 | 1.7% | |
| 2000 | 38 | 0.5% | |
| 5000 | 36 | 0.5% | |
| 1000 | 35 | 0.5% | |
| 4000 | 25 | 0.4% | |
| 6000 | 25 | 0.4% | |
| 3000 | 25 | 0.4% | |
| 10000 | 22 | 0.3% | |
| Other values (2970) | 4451 | 64.4% |
| Value | Count | Frequency (%) | |
| 0 | 603 | 8.7% | |
| 1 | 1 | < 0.1% | |
| 2 | 1 | < 0.1% | |
| 3 | 1 | < 0.1% | |
| 6 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 5526252 | 119 | 1.7% | |
| 3132500 | 1 | < 0.1% | |
| 3129000 | 1 | < 0.1% | |
| 3128409 | 1 | < 0.1% | |
| 3119789 | 1 | < 0.1% |
ch_health_limit
Real number (ℝ≥0)
| Distinct | 39 |
|---|---|
| Distinct (%) | 0.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 40.59770959 |
|---|---|
| Minimum | 0 |
| Maximum | 100 |
| Zeros | 1103 |
| Zeros (%) | 16.0% |
| Memory size | 54.0 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 10 |
| median | 40 |
| Q3 | 60 |
| 95-th percentile | 100 |
| Maximum | 100 |
| Range | 100 |
| Interquartile range (IQR) | 50 |
Descriptive statistics
| Standard deviation | 33.70671493 |
|---|---|
| Coefficient of variation (CV) | 0.8302614918 |
| Kurtosis | -1.031030002 |
| Mean | 40.59770959 |
| Median Absolute Deviation (MAD) | 30 |
| Skewness | 0.4667355731 |
| Sum | 280611.3687 |
| Variance | 1136.142631 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 50 | 1299 | 18.8% | |
| 0 | 1103 | 16.0% | |
| 100 | 892 | 12.9% | |
| 10 | 866 | 12.5% | |
| 20 | 607 | 8.8% | |
| 80 | 377 | 5.5% | |
| 30 | 350 | 5.1% | |
| 40 | 225 | 3.3% | |
| 25 | 187 | 2.7% | |
| 60 | 186 | 2.7% | |
| Other values (29) | 820 | 11.9% |
| Value | Count | Frequency (%) | |
| 0 | 1103 | 16.0% | |
| 1 | 40 | 0.6% | |
| 2 | 17 | 0.2% | |
| 3 | 9 | 0.1% | |
| 4 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 100 | 892 | 12.9% | |
| 99 | 7 | 0.1% | |
| 98 | 1 | < 0.1% | |
| 97 | 1 | < 0.1% | |
| 95 | 18 | 0.3% |
fam_size
Real number (ℝ≥0)
| Distinct | 13 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.380353009 |
|---|---|
| Minimum | 1 |
| Maximum | 13 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 54.0 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 2 |
| Q3 | 3 |
| 95-th percentile | 5 |
| Maximum | 13 |
| Range | 12 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 1.319195456 |
|---|---|
| Coefficient of variation (CV) | 0.5542016041 |
| Kurtosis | 3.207504651 |
| Mean | 2.380353009 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 1.375227509 |
| Sum | 16453 |
| Variance | 1.740276652 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 2 | 2519 | 36.4% | |
| 1 | 1887 | 27.3% | |
| 3 | 1275 | 18.4% | |
| 4 | 751 | 10.9% | |
| 5 | 305 | 4.4% | |
| 6 | 104 | 1.5% | |
| 7 | 41 | 0.6% | |
| 8 | 20 | 0.3% | |
| 9 | 4 | 0.1% | |
| 11 | 2 | < 0.1% | |
| Other values (3) | 4 | 0.1% |
| Value | Count | Frequency (%) | |
| 1 | 1887 | 27.3% | |
| 2 | 2519 | 36.4% | |
| 3 | 1275 | 18.4% | |
| 4 | 751 | 10.9% | |
| 5 | 305 | 4.4% |
| Value | Count | Frequency (%) | |
| 13 | 1 | < 0.1% | |
| 12 | 1 | < 0.1% | |
| 11 | 2 | < 0.1% | |
| 10 | 2 | < 0.1% | |
| 9 | 4 | 0.1% |
fam_net_income
Real number (ℝ≥0)
| Distinct | 2489 |
|---|---|
| Distinct (%) | 36.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 91078.70984 |
|---|---|
| Minimum | 0 |
| Maximum | 922631 |
| Zeros | 210 |
| Zeros (%) | 3.0% |
| Memory size | 54.0 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 2654.4 |
| Q1 | 30000 |
| median | 70898 |
| Q3 | 98000 |
| 95-th percentile | 228045 |
| Maximum | 922631 |
| Range | 922631 |
| Interquartile range (IQR) | 68000 |
Descriptive statistics
| Standard deviation | 127057.5076 |
|---|---|
| Coefficient of variation (CV) | 1.395029726 |
| Kurtosis | 29.66085923 |
| Mean | 91078.70984 |
| Median Absolute Deviation (MAD) | 36898 |
| Skewness | 5.023482013 |
| Sum | 629536042.4 |
| Variance | 1.614361024e+10 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 91078.70984 | 936 | 13.5% | |
| 0 | 210 | 3.0% | |
| 922631 | 122 | 1.8% | |
| 30000 | 72 | 1.0% | |
| 60000 | 69 | 1.0% | |
| 40000 | 68 | 1.0% | |
| 50000 | 58 | 0.8% | |
| 45000 | 54 | 0.8% | |
| 100000 | 48 | 0.7% | |
| 25000 | 48 | 0.7% | |
| Other values (2479) | 5227 | 75.6% |
| Value | Count | Frequency (%) | |
| 0 | 210 | 3.0% | |
| 25 | 1 | < 0.1% | |
| 31 | 1 | < 0.1% | |
| 36 | 1 | < 0.1% | |
| 50 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 922631 | 122 | 1.8% | |
| 350000 | 4 | 0.1% | |
| 348000 | 1 | < 0.1% | |
| 347000 | 1 | < 0.1% | |
| 345000 | 1 | < 0.1% |
fam_pov
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 54.0 KiB |
| Value | Count | Frequency (%) | |
| 0 | 4958 | 71.7% | |
| 1 | 1018 | 14.7% | |
| 0.1703480589 | 936 | 13.5% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 19 |
|---|---|
| Median length | 3 |
| Mean length | 5.166666667 |
| Min length | 3 |
region
Real number (ℝ≥0)
| Distinct | 5 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.67810219 |
|---|---|
| Minimum | 1 |
| Maximum | 4 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 54.0 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 2 |
| median | 3 |
| Q3 | 3 |
| 95-th percentile | 4 |
| Maximum | 4 |
| Range | 3 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 0.9467989256 |
|---|---|
| Coefficient of variation (CV) | 0.3535335318 |
| Kurtosis | -0.7596751435 |
| Mean | 2.67810219 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | -0.3461576985 |
| Sum | 18511.04234 |
| Variance | 0.8964282055 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 3 | 2963 | 42.9% | |
| 2 | 1537 | 22.2% | |
| 4 | 1344 | 19.4% | |
| 1 | 1006 | 14.6% | |
| 2.67810219 | 62 | 0.9% |
| Value | Count | Frequency (%) | |
| 1 | 1006 | 14.6% | |
| 2 | 1537 | 22.2% | |
| 2.67810219 | 62 | 0.9% | |
| 3 | 2963 | 42.9% | |
| 4 | 1344 | 19.4% |
| Value | Count | Frequency (%) | |
| 4 | 1344 | 19.4% | |
| 3 | 2963 | 42.9% | |
| 2.67810219 | 62 | 0.9% | |
| 2 | 1537 | 22.2% | |
| 1 | 1006 | 14.6% |
marital
Real number (ℝ≥0)
| Distinct | 5 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.53587963 |
|---|---|
| Minimum | 0 |
| Maximum | 6 |
| Zeros | 1050 |
| Zeros (%) | 15.2% |
| Memory size | 54.0 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 1 |
| Q3 | 3 |
| 95-th percentile | 3 |
| Maximum | 6 |
| Range | 6 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 1.316343282 |
|---|---|
| Coefficient of variation (CV) | 0.8570614891 |
| Kurtosis | 2.417756092 |
| Mean | 1.53587963 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 1.446602382 |
| Sum | 10616 |
| Variance | 1.732759637 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) | |
| 1 | 3692 | 53.4% | |
| 3 | 1588 | 23.0% | |
| 0 | 1050 | 15.2% | |
| 2 | 333 | 4.8% | |
| 6 | 249 | 3.6% |
| Value | Count | Frequency (%) | |
| 0 | 1050 | 15.2% | |
| 1 | 3692 | 53.4% | |
| 2 | 333 | 4.8% | |
| 3 | 1588 | 23.0% | |
| 6 | 249 | 3.6% |
| Value | Count | Frequency (%) | |
| 6 | 249 | 3.6% | |
| 3 | 1588 | 23.0% | |
| 2 | 333 | 4.8% | |
| 1 | 3692 | 53.4% | |
| 0 | 1050 | 15.2% |
urban_rural
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 54.0 KiB |
| Value | Count | Frequency (%) | |
| 1 | 5332 | 77.1% | |
| 0 | 1430 | 20.7% | |
| 2 | 88 | 1.3% | |
| 0.8040875912 | 62 | 0.9% |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Length
| Max length | 18 |
|---|---|
| Median length | 3 |
| Mean length | 3.134548611 |
| Min length | 3 |
age_2016
Real number (ℝ≥0)
| Distinct | 9 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 54.56409144 |
|---|---|
| Minimum | 51 |
| Maximum | 59 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 54.0 KiB |
Quantile statistics
| Minimum | 51 |
|---|---|
| 5-th percentile | 51 |
| Q1 | 53 |
| median | 54 |
| Q3 | 56 |
| 95-th percentile | 58 |
| Maximum | 59 |
| Range | 8 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 2.253491559 |
|---|---|
| Coefficient of variation (CV) | 0.04129990072 |
| Kurtosis | -1.060070832 |
| Mean | 54.56409144 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 0.1381518795 |
| Sum | 377147 |
| Variance | 5.078224207 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 52 | 984 | 14.2% |
| 55 | 968 | 14.0% |
| 54 | 953 | 13.8% |
| 53 | 952 | 13.8% |
| 56 | 816 | 11.8% |
| 58 | 744 | 10.8% |
| 57 | 720 | 10.4% |
| 51 | 599 | 8.7% |
| 59 | 176 | 2.5% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 51 | 599 | 8.7% |
| 52 | 984 | 14.2% |
| 53 | 952 | 13.8% |
| 54 | 953 | 13.8% |
| 55 | 968 | 14.0% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 59 | 176 | 2.5% |
| 58 | 744 | 10.8% |
| 57 | 720 | 10.4% |
| 56 | 816 | 11.8% |
| 55 | 968 | 14.0% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a perfectly linear relationship the steepness of the line does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
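As an illustration of that definition, here is a minimal pure-Python sketch. The function name `pearson_r` and the toy data are made up for this example and are not taken from the report; it is not the code the profiling tool runs.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # The 1/(n-1) factors of the covariance and of both standard deviations
    # cancel, so plain sums of deviations are enough.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Toy data (illustrative only).
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

r = pearson_r(x, y)
# Shifting and positively rescaling y leaves r unchanged.
r_rescaled = pearson_r(x, [10 * b + 3 for b in y])
# An exactly linear relationship gives r = 1 regardless of slope.
r_linear = pearson_r(x, [2 * a + 1 for a in x])

print(r, r_rescaled, r_linear)
```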
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at capturing nonlinear monotonic relationships than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
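A minimal sketch of that recipe: rank both variables, then apply the Pearson formula to the ranks. The helper names and toy data are assumptions made for this example, and ties are given their average rank, which is one common convention.

```python
import math

def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Toy data (illustrative only): y grows monotonically but not linearly with x.
x = [1, 2, 3, 4, 5]
y = [a ** 3 for a in x]

rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, r)
```

On this monotonic but nonlinear toy relationship, ρ reaches 1 while r stays below 1, which is the difference the paragraph describes.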
Kendall's τ
Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
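The pair-counting definition can be sketched directly. This follows the formula exactly as stated here (often called tau-a); library implementations commonly use a tie-adjusted variant, so results may differ when the data contain ties. The function name and toy data are assumptions for illustration.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1      # both variables order the pair the same way
        elif sign < 0:
            discordant += 1      # the variables order the pair oppositely
        # sign == 0 means a tie in x or y: counted in neither group
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Toy data (illustrative only): 10 pairs, of which 8 are concordant
# and 2 are discordant.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

tau = kendall_tau_a(x, y)
print(tau)
```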
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for the Phik package.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
First rows
| | df_index | case_id | age | hgc_mother | hgc_father | sample_id | sample_race | sample_sex | pov_1980 | asvab_math | asvab_word | major | max_degree | occup_2016 | css_worker | firm_size | unemp_2016 | income_2016 | fam_net_worth | ch_health_limit | fam_size | fam_net_income | fam_pov | region | marital | urban_rural | age_2016 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 2 | 20 | 5.0 | 8.0 | 5 | 3 | 2 | 0.000000 | 377.000000 | 369.000000 | 1233.85069 | 1.000000 | 4020.000000 | 1.000000 | 300.0000 | 1.0 | 23000.0 | 4.233459e+05 | 40.59771 | 4.0 | 91078.709839 | 0.170348 | 1.0 | 1.0 | 1.0 | 57 |
| 1 | 2 | 3 | 17 | 10.0 | 12.0 | 5 | 3 | 2 | 0.202167 | 475.040082 | 475.071774 | 1233.85069 | 1.787833 | 4320.000000 | 2.000000 | 50.0000 | 1.0 | 29000.0 | 1.150000e+05 | 20.00000 | 2.0 | 79000.000000 | 0.000000 | 4.0 | 1.0 | 1.0 | 54 |
| 2 | 3 | 4 | 16 | 11.0 | 12.0 | 5 | 3 | 2 | 0.000000 | 493.000000 | 544.000000 | 514.00000 | 1.000000 | 2630.000000 | 4.000000 | 1333.7936 | 0.0 | 73000.0 | 1.128500e+05 | 70.00000 | 2.0 | 144600.000000 | 0.000000 | 4.0 | 1.0 | 1.0 | 53 |
| 3 | 5 | 6 | 18 | 12.0 | 12.0 | 1 | 3 | 1 | 0.000000 | 681.000000 | 692.000000 | 414.00000 | 4.000000 | 2710.000000 | 1.000000 | 30000.0000 | 1.0 | 115000.0 | 9.582500e+04 | 30.00000 | 4.0 | 169010.000000 | 0.000000 | 2.0 | 1.0 | 1.0 | 55 |
| 4 | 7 | 8 | 20 | 9.0 | 6.0 | 6 | 3 | 2 | 0.000000 | 428.000000 | 681.000000 | 1233.85069 | 1.000000 | 830.000000 | 2.000000 | 200.0000 | 1.0 | 47000.0 | 1.727170e+05 | 40.00000 | 4.0 | 47000.000000 | 0.000000 | 1.0 | 1.0 | 1.0 | 57 |
| 5 | 8 | 9 | 15 | 12.0 | 10.0 | 1 | 3 | 1 | 0.000000 | 536.000000 | 532.000000 | 501.00000 | 2.000000 | 10.000000 | 4.000000 | 1333.7936 | 1.0 | 90000.0 | 6.790000e+05 | 0.00000 | 3.0 | 200000.000000 | 0.000000 | 1.0 | 1.0 | 1.0 | 52 |
| 6 | 13 | 14 | 15 | 12.0 | 12.0 | 5 | 3 | 2 | 0.000000 | 653.000000 | 604.000000 | 2204.00000 | 4.000000 | 800.000000 | 2.000000 | 1333.7936 | 0.0 | 70000.0 | 1.570800e+06 | 0.00000 | 1.0 | 72500.000000 | 0.000000 | 1.0 | 3.0 | 1.0 | 52 |
| 7 | 14 | 15 | 15 | 12.0 | 12.0 | 1 | 3 | 1 | 0.000000 | 567.000000 | 651.000000 | 2207.00000 | 3.000000 | 4245.218674 | 2.178148 | 1333.7936 | 0.0 | 0.0 | 5.526252e+06 | 70.00000 | 6.0 | 922631.000000 | 0.000000 | 1.0 | 1.0 | 1.0 | 52 |
| 8 | 15 | 16 | 20 | 12.0 | 12.0 | 5 | 3 | 2 | 0.000000 | 510.000000 | 489.000000 | 1402.00000 | 1.000000 | 5700.000000 | 2.000000 | 1000.0000 | 1.0 | 100000.0 | 4.233459e+05 | 20.00000 | 2.0 | 107300.000000 | 0.000000 | 1.0 | 3.0 | 1.0 | 57 |
| 9 | 16 | 17 | 22 | 12.0 | 15.0 | 1 | 3 | 1 | 0.000000 | 637.000000 | 572.000000 | 4901.00000 | 1.000000 | 6660.000000 | 1.000000 | 50.0000 | 1.0 | 73000.0 | 7.500000e+03 | 50.00000 | 1.0 | 73000.000000 | 0.000000 | 1.0 | 3.0 | 1.0 | 59 |
Last rows
| | df_index | case_id | age | hgc_mother | hgc_father | sample_id | sample_race | sample_sex | pov_1980 | asvab_math | asvab_word | major | max_degree | occup_2016 | css_worker | firm_size | unemp_2016 | income_2016 | fam_net_worth | ch_health_limit | fam_size | fam_net_income | fam_pov | region | marital | urban_rural | age_2016 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 6902 | 12565 | 12566 | 19 | 12.000000 | 11.000000 | 19 | 2 | 2 | 0.202167 | 475.040082 | 475.071774 | 1203.00000 | 1.0 | 4245.218674 | 2.178148 | 1333.7936 | 0.0 | 0.0 | 88000.000000 | 100.0 | 2.0 | 38400.0 | 0.0 | 3.0 | 3.0 | 1.0 | 56 |
| 6903 | 12587 | 12588 | 20 | 14.000000 | 10.947426 | 19 | 2 | 2 | 0.202167 | 475.040082 | 475.071774 | 1233.85069 | 1.0 | 4610.000000 | 2.000000 | 1333.7936 | 1.0 | 22000.0 | 423345.891058 | 50.0 | 1.0 | 22000.0 | 0.0 | 3.0 | 0.0 | 0.0 | 57 |
| 6904 | 12588 | 12589 | 22 | 12.000000 | 10.947426 | 16 | 2 | 1 | 0.000000 | 475.040082 | 475.071774 | 501.00000 | 1.0 | 7340.000000 | 2.000000 | 8.0000 | 1.0 | 28900.0 | 249280.000000 | 10.0 | 2.0 | 86900.0 | 0.0 | 3.0 | 1.0 | 1.0 | 59 |
| 6905 | 12593 | 12594 | 21 | 12.000000 | 10.947426 | 16 | 2 | 1 | 0.202167 | 475.040082 | 475.071774 | 1233.85069 | 1.0 | 3920.000000 | 2.000000 | 1333.7936 | 0.0 | 0.0 | 0.000000 | 70.0 | 3.0 | 0.0 | 1.0 | 3.0 | 2.0 | 1.0 | 58 |
| 6906 | 12642 | 12643 | 22 | 10.869086 | 12.000000 | 16 | 2 | 1 | 0.000000 | 475.040082 | 475.071774 | 1233.85069 | 1.0 | 4245.218674 | 2.178148 | 1333.7936 | 0.0 | 0.0 | 26900.000000 | 100.0 | 1.0 | 6112.0 | 1.0 | 1.0 | 0.0 | 1.0 | 59 |
| 6907 | 12647 | 12648 | 20 | 16.000000 | 12.000000 | 16 | 2 | 1 | 0.000000 | 475.040082 | 475.071774 | 909.00000 | 1.0 | 1000.000000 | 1.000000 | 1000.0000 | 1.0 | 80000.0 | 423345.891058 | 0.0 | 4.0 | 80000.0 | 0.0 | 3.0 | 1.0 | 1.0 | 57 |
| 6908 | 12658 | 12659 | 21 | 5.000000 | 10.947426 | 16 | 2 | 1 | 0.000000 | 475.040082 | 475.071774 | 506.00000 | 3.0 | 4250.000000 | 4.000000 | 1333.7936 | 1.0 | 44000.0 | 423345.891058 | 50.0 | 2.0 | 81500.0 | 0.0 | 2.0 | 1.0 | 1.0 | 58 |
| 6909 | 12662 | 12663 | 21 | 9.000000 | 6.000000 | 15 | 3 | 1 | 0.202167 | 475.040082 | 475.071774 | 202.00000 | 1.0 | 9130.000000 | 2.000000 | 75.0000 | 1.0 | 75000.0 | 62000.000000 | 10.0 | 3.0 | 92000.0 | 0.0 | 3.0 | 1.0 | 1.0 | 58 |
| 6910 | 12666 | 12667 | 20 | 8.000000 | 10.947426 | 20 | 1 | 2 | 0.000000 | 475.040082 | 475.071774 | 2105.00000 | 1.0 | 4245.218674 | 2.178148 | 1333.7936 | 0.0 | 0.0 | 423345.891058 | 100.0 | 2.0 | 16094.0 | 0.0 | 3.0 | 3.0 | 1.0 | 57 |
| 6911 | 12678 | 12679 | 21 | 8.000000 | 8.000000 | 16 | 2 | 1 | 1.000000 | 475.040082 | 475.071774 | 1233.85069 | 1.0 | 4245.218674 | 2.178148 | 1333.7936 | 0.0 | 0.0 | 97000.000000 | 85.0 | 1.0 | 7200.0 | 1.0 | 3.0 | 1.0 | 0.0 | 58 |